Collective Classification for Spam Filtering

نویسندگان

Carlos Laorden

Borja Sanz

Igor Santos

Patxi Galán-García

Pablo García Bringas

چکیده

Spam has become a major issue in computer security because it is a channel for threats such as computer viruses, worms and phishing. Many solutions feature machine-learning algorithms trained using statistical representations of the terms that usually appear in the e-mails. Still, these methods require a training step with labelled data. Dealing with the situation where the availability of labelled training instances is limited slows down the progress of filtering systems and offers advantages to spammers. Currently, many approaches direct their efforts into Semi-Supervised Learning (SSL). SSL is a halfway method between supervised and unsupervised learning, which, in addition to unlabelled data, receives some supervision information such as the association of the targets with some of the examples. Collective Classification for Text Classification poses as an interesting method for optimising the classification of partially-labelled data. In this way, we propose here, for the first time, Collective Classification algorithms for spam filtering to overcome the amount of unclassified e-mails that are sent every day.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Collective Classification of Social Network Spam

Unsolicited or unwanted messages is a byproduct of virtually every popular social media website. Spammers have become increasingly proficient at bypassing conventional spam filters, prompting a stronger effort to develop new methods that accurately detect spam while simultaneously acting as a more robust classifier against users that modify their behavior in order to avoid detection. This paper...

متن کامل

Classifying Unsolicited Bulk Email (UBE) using Python Machine Learning Techniques

Email has become one of the fastest and most economical forms of communication. However, the increase of email users has resulted in the dramatic increase of spam emails during the past few years. As spammers always try to find a way to evade existing filters, new filters need to be developed to catch spam. Generally, the main tool for email filtering is based on text classification. A classifi...

متن کامل

Single-Class Learning for Spam Filtering: An Ensemble Approach

Spam, also known as Unsolicited Commercial Email (UCE), has been an increasingly annoying problem to individuals and organizations. Most of prior research formulated spam filtering as a classical text categorization task, in which training examples must include both spam emails (positive examples) and legitimate mails (negatives). However, in many spam filtering scenarios, obtaining legitimate ...

متن کامل

A Novel Method for Detecting Spam Email using KNN Classification with Spearman Correlation as Distance Measure

E-mail is the most prevalent methods for correspondence because of its availability, quick message exchange and low sending cost. Spam mail appears as a serious issue influencing this application today's internet. Spam may contain suspicious URL’s, or may ask for financial information as money exchange information or credit card details. Here comes the scope of filtering spam from legitimate em...

متن کامل

Support Vector Machines Parameter Selection Based on Combined Taguchi Method and Staelin Method for E-mail Spam Filtering

Support vector machines (SVM) are a powerful tool for building good spam filtering models. However, the performance of the model depends on parameter selection. Parameter selection of SVM will affect classification performance seriously during training process. In this study, we use combined Taguchi method and Staelin method to optimize the SVM-based E-mail Spam Filtering model and promote spam...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2011

Collective Classification for Spam Filtering

نویسندگان

چکیده

منابع مشابه

Collective Classification of Social Network Spam

Classifying Unsolicited Bulk Email (UBE) using Python Machine Learning Techniques

Single-Class Learning for Spam Filtering: An Ensemble Approach

A Novel Method for Detecting Spam Email using KNN Classification with Spearman Correlation as Distance Measure

Support Vector Machines Parameter Selection Based on Combined Taguchi Method and Staelin Method for E-mail Spam Filtering

عنوان ژورنال:

اشتراک گذاری